As IoT networks grow larger and more organizationally entangled, a deceptively simple question keeps resurfacing: who actually owns the data these devices produce? The question is hardest to answer when observations come from many heterogeneous sources, pass through shared infrastructure, and carry meaning that shifts with context. Today’s solutions typically lock ownership in at deployment time, an assumption that breaks down the moment derived data streams blend contributions from multiple stakeholders. We introduce SemOwn, a formally grounded framework that represents IoT data in a typed, attributed knowledge graph and infers ownership on the fly through context-weighted scoring and cooperative game-theoretic attribution. At its core lies a Semantic Ownership Graph Gso = (V, E, ?V, ?E) and a parametric ownership function O that weaves together spatial relevance, temporal validity, data sensitivity, and governance policy. For observations derived from several sources, we use the Shapley value to divide ownership fairly among contributors, and a role-based conflict resolution protocol settles disputes when stakes are closely matched. We prove that O satisfies efficiency, symmetry, null-player, and Lipschitz stability properties, and we bound the inference complexity under Monte Carlo approximation. On a synthetic smart-building dataset of 105 observations, SemOwn reaches an F1-score of 0.91, outstripping static baselines by 31 points and the strongest machine-learning baseline by 12, while matching a domain-expert panel on 89% of multi-owner conflicts, all at sub-50 ms latency. The framework is designed to support auditable ownership attribution in IoT data marketplaces, regulatory compliance, and federated multi-tenant deployments.
Introduction
Modern Internet of Things (IoT) systems generate massive amounts of machine-produced data from multiple sources such as sensors, devices, tenants, and service providers. A major unresolved issue is determining who owns this data, especially when data is combined or derived from different contributors. Existing intellectual property (IP) laws do not provide a clear definition of ownership for machine-generated IoT data.
Current regulations such as the GDPR (European Union), the EU Data Act, the CCPA (California, United States), and India’s DPDP Act mainly focus on data privacy, access, and control rather than ownership rights. Traditional IP laws such as copyright and patents are also insufficient because they protect human creative or inventive work, which machine-generated data lacks. As a result, organizations rely on simple ownership assumptions, such as giving ownership to the device owner or platform operator, and these assumptions ignore the complexity of multi-source data generation.
The study proposes SemOwn, a computational framework that uses Semantic Web technologies and Knowledge Graphs (KGs) to determine fair ownership distribution of IoT data. The framework combines ontology modeling, contextual reasoning, and cooperative game theory.
Main Contributions of SemOwn:
SemOwn-O Ontology: An OWL 2 DL-based model that represents IoT entities, ownership claims, data sources, and relationships. It extends existing IoT ontologies such as SSN/SOSA and SAREF.
Ownership Function: Calculates ownership scores based on location relevance, time validity, data sensitivity, and governance policies.
Shapley Value-Based Attribution: Uses cooperative game theory to divide ownership among multiple contributors, assigning fractional ownership based on each contributor’s actual contribution to the final data product.
Conflict Resolution Mechanism: Provides deterministic and auditable rules for resolving ownership disputes.
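To illustrate what a deterministic, auditable rule might look like, the following Python sketch resolves a dispute between stakeholders whose ownership shares are closely matched. It is only an illustration under stated assumptions: the role ranking, the threshold epsilon, the reason labels, and the function names are all hypothetical and are not taken from SemOwn itself.

```python
# Hypothetical role ranking (earlier = higher priority); not taken from the paper.
ROLE_PRIORITY = ["device_owner", "tenant", "platform_operator", "service_provider"]


def role_rank(role):
    """Rank of a role; unknown or missing roles sort after all known ones."""
    return ROLE_PRIORITY.index(role) if role in ROLE_PRIORITY else len(ROLE_PRIORITY)


def resolve_primary_owner(shares, roles, epsilon=0.05):
    """Pick a primary owner and return (stakeholder, reason).

    shares:  stakeholder id -> fractional ownership share
    roles:   stakeholder id -> role name
    epsilon: assumed threshold below which two shares count as closely matched
    """
    if not shares:
        raise ValueError("at least one stakeholder share is required")
    best = max(shares.values())
    # Everyone within epsilon of the best share is treated as a contender.
    contenders = sorted(s for s, v in shares.items() if best - v < epsilon)
    if len(contenders) == 1:
        return contenders[0], "largest_share"
    # Break the tie by role rank, then by stakeholder id, so the outcome
    # never depends on dictionary ordering.
    winner = min(contenders, key=lambda s: (role_rank(roles.get(s)), s))
    return winner, "role_priority_tiebreak"
```

Returning the reason alongside the winner is one simple way to leave an audit trail: the same inputs always give the same owner and the same recorded justification.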
Problem Addressed:
The research identifies three major challenges:
Legal gap: No global law defines ownership rights for machine-generated IoT data.
Attribution problem: Data created from multiple sources requires fair ownership allocation.
Enforcement problem: Disputes across organizations and countries lack reliable resolution methods.
Technical Approach:
The framework creates a Semantic Ownership Graph containing:
Devices
Users/stakeholders
Observations
Locations
Time information
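As a rough illustration of such a graph, the sketch below stores typed nodes, per-node attributes, and typed edges in plain Python containers. The class names, the node-type vocabulary, the edge labels (madeBy, locatedIn, occupies, validDuring), and the example identifiers are hypothetical; the actual SemOwn-O vocabulary may differ.

```python
from dataclasses import dataclass, field

# Assumed node-type vocabulary mirroring the five kinds of entity listed above.
NODE_TYPES = {"Device", "Stakeholder", "Observation", "Location", "TimeInterval"}


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str


@dataclass
class OwnershipGraph:
    nodes: dict = field(default_factory=dict)       # node_id -> Node
    node_attrs: dict = field(default_factory=dict)  # node_id -> attribute dict
    edges: list = field(default_factory=list)       # (source_id, edge_type, target_id)

    def add_node(self, node_id, node_type, **attrs):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = Node(node_id, node_type)
        self.node_attrs[node_id] = dict(attrs)

    def add_edge(self, source_id, edge_type, target_id):
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source_id, edge_type, target_id))

    def neighbours(self, source_id, edge_type):
        """Targets reachable from source_id over edges of the given type."""
        return [t for s, e, t in self.edges if s == source_id and e == edge_type]


def build_example_graph():
    """Tiny smart-building example with made-up identifiers."""
    g = OwnershipGraph()
    g.add_node("sensor-1", "Device", kind="temperature")
    g.add_node("tenant-a", "Stakeholder", role="tenant")
    g.add_node("obs-1", "Observation", value=21.5, unit="degC")
    g.add_node("room-101", "Location", floor=1)
    g.add_node("lease-2024", "TimeInterval", start="2024-01-01", end="2024-12-31")
    g.add_edge("obs-1", "madeBy", "sensor-1")
    g.add_edge("sensor-1", "locatedIn", "room-101")
    g.add_edge("tenant-a", "occupies", "room-101")
    g.add_edge("tenant-a", "validDuring", "lease-2024")
    return g
```

In this sketch, an ownership query would start from an observation node and follow typed edges to the device, its location, and the stakeholders and agreement periods attached to that location.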
A context model evaluates each data observation by considering:
How closely the data relates to a stakeholder or location
Whether the data was collected during an active agreement period
Sensitivity of the information
Applicable policies
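One simple way to combine these four factors is a weighted average, sketched below in Python. The linear form, the equal default weights, the [0, 1] scaling of each factor, and all names are assumptions made for illustration; the source does not specify the exact parametric form of the ownership function O.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Per-stakeholder context factors for one observation, each scaled to [0, 1]."""
    spatial: float      # how closely the data relates to the stakeholder or location
    temporal: float     # e.g. 1.0 inside an active agreement period, 0.0 outside
    sensitivity: float  # sensitivity of the information
    policy: float       # degree to which applicable policies support the claim


# Assumed equal weighting; a real deployment would set these per policy.
DEFAULT_WEIGHTS = {"spatial": 0.25, "temporal": 0.25, "sensitivity": 0.25, "policy": 0.25}


def ownership_score(ctx, weights=DEFAULT_WEIGHTS):
    """Weighted average of the four context factors (illustrative form only)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    factors = {
        "spatial": ctx.spatial,
        "temporal": ctx.temporal,
        "sensitivity": ctx.sensitivity,
        "policy": ctx.policy,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return sum(weights[name] * factors[name] for name in factors) / total_weight


def normalise(scores):
    """Turn raw scores into shares that sum to 1 (equal split if all are zero)."""
    total = sum(scores.values())
    if total == 0:
        return {k: 1.0 / len(scores) for k in scores}
    return {k: v / total for k, v in scores.items()}
```

For example, a tenant with context (0.9, 1.0, 0.5, 0.8) scores 0.8 under the equal weights, while an operator with (0.2, 1.0, 0.5, 0.4) scores 0.525; normalising the two gives shares that sum to 1.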
For combined or derived data, the framework applies Shapley values to measure each contributor’s importance and assign ownership weights that sum to 100%.
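The attribution step can be sketched as follows. The code computes exact Shapley values by averaging marginal contributions over every join order, and a permutation-sampling estimate for larger contributor sets. The value function, the contributor names, and the sample count are hypothetical; how SemOwn actually measures the value of a coalition of sources is not specified here.

```python
import random
from itertools import permutations
from math import factorial


def marginal_contributions(order, value):
    """Yield (player, marginal gain) as players join the coalition in `order`."""
    coalition = frozenset()
    previous = value(coalition)
    for player in order:
        coalition = coalition | {player}
        current = value(coalition)
        yield player, current - previous
        previous = current


def shapley_exact(players, value):
    """Average each player's marginal gain over all n! join orders."""
    players = list(players)
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        for player, gain in marginal_contributions(order, value):
            totals[player] += gain
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def shapley_monte_carlo(players, value, n_samples=2000, seed=0):
    """Estimate Shapley values from uniformly sampled join orders."""
    players = list(players)
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = players[:]
        rng.shuffle(order)
        for player, gain in marginal_contributions(order, value):
            totals[player] += gain
    return {p: t / n_samples for p, t in totals.items()}


def ownership_shares(phi):
    """Rescale attributions into fractions summing to 1.

    Assumes a monotone value function, so that no attribution is negative.
    """
    total = sum(phi.values())
    if total <= 0:
        raise ValueError("shares are undefined when the total value is not positive")
    return {p: v / total for p, v in phi.items()}


def example_value(coalition):
    """Hypothetical worth of a derived comfort index given the available sources."""
    has_temp = "temp_owner" in coalition
    has_hum = "humidity_owner" in coalition
    if has_temp and has_hum:
        return 1.0
    if has_temp:
        return 0.6
    if has_hum:
        return 0.2
    return 0.0


PLAYERS = ["temp_owner", "humidity_owner", "idle_owner"]
```

With this toy value function, exact attribution gives 0.7 to temp_owner, 0.3 to humidity_owner, and 0 to idle_owner, whose source never changes the result. The three values sum to the worth of the full coalition, which is the efficiency property, and the zero for the idle contributor is the null-player property. The sampled estimate preserves both properties because every sampled order telescopes to the same total.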
Importance of the Study:
The proposed system aims to replace informal ownership assumptions with a transparent, mathematically grounded approach. It provides:
Fair ownership attribution
Traceability of data contributions
Explainable decision-making
Support for future IoT data governance
Conclusion
SemOwn sits at the intersection of computational methods and legal governance for IoT data attribution. By documenting the persistent absence of harmonised ownership doctrine across major jurisdictions, we make the case for interdisciplinary solutions that marry legal nuance with mathematical rigour. The framework delivers this through a formal, machine-readable ontology and a game-theoretic attribution engine that can resolve multi-stakeholder disputes in real time. As IoT deployments continue to grow, static, contract-based ownership models will struggle to keep pace. The path forward lies in deeper integration of computational attribution with regulatory compliance and dispute-resolution workflows, so that machine-generated data can be attributed transparently, auditably, and in ways that are practically actionable.